{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Lecture 2: Basic Single & Multi-Table SQL\n",
    "======================"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Connected: @None'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%load_ext sql\n",
    "%sql sqlite://"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Let's create a table, stuff it with data, and query it!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n",
      "Done.\n",
      "1 rows affected.\n",
      "1 rows affected.\n",
      "1 rows affected.\n",
      "1 rows affected.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql drop table if exists product;\n",
    "create table product(\n",
    "       pname        varchar primary key, -- name of the product\n",
    "       price        money,               -- price of the product\n",
    "       category     varchar,             -- category\n",
    "       manufacturer varchar NOT NULL     -- manufacturer\n",
    ");\n",
    "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n",
    "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n",
    "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n",
    "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Let's look at the products.."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>category</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Photography</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>MultiTouch</td>\n",
       "        <td>203.99</td>\n",
       "        <td>Household</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),\n",
       " ('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks'),\n",
       " ('SingleTouch', 149.99, 'Photography', 'Canon'),\n",
       " ('MultiTouch', 203.99, 'Household', 'Hitachi')]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%sql select * from product;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Some *terminology* for SQL.\n",
    "--------------------------\n",
    "* The _name_ of the table is product.\n",
    "* Each row of the table is called a _row_ or a _tuple_. \n",
    "* Notice all tuples have the fields or _attributes_.\n",
    "* The number of rows is called the _cardinality_ while the number of attributes is called the _arity_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Schema Conventions\n",
    "-----------------\n",
    "* The schema of product is written as follows:\n",
    "\n",
    "> product(<u>pname</u>, price, category, manufacturer)\n",
    "\n",
    "Underlining a set of attributes indicates that they form a _key_.\n",
    "\n",
    "* In this case, pname is a key. If the product name was only unique for a given manufacturer, we'd write:\n",
    "\n",
    "> product(<u>pname</u>, price, category, <u>manufacturer</u>)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Tables Explained\n",
    "----------------\n",
    "* A tuple = a record\n",
    "  * Restriction: all attributes are of atomic type\n",
    "  * There are many atomic data types in SQL engines, look [here](http://www.postgresql.org/docs/9.4/static/datatype.html) for example.\n",
    "\n",
    "\n",
    "* A table = a (multi)-set of tuples\n",
    "  * A multiset is like a list…\n",
    "  * ... but a mutiset is unordered: \n",
    "    * no first(), no next(), no last()."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "# Outline\n",
    "* Create a database -- done!\n",
    "\n",
    "* Simple querying -- now!\n",
    "\n",
    "* Queries with more than one relation -- next!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Let the (Simple) querying begin! \n",
    "* We'll introduce the basics of SQL by example.\n",
    "* There are many good SQL tutorials on the web, this is intended to get you started."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "> SELECT (attributes)<br>\n",
    "> FROM (one or more tables)<br>\n",
    "> WHERE (conditions)\n",
    "\n",
    "This is the simple SELECT-FROM-WHERE (SFW) block. Let's see some examples!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>category</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks')]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT * from Product \n",
    "WHERE category='Gadgets' and price > 20.0;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Let's give an example of *projection*, i.e., we only retain some attributes from the query. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>MultiTouch</td>\n",
       "        <td>203.99</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'GizmoWorks'),\n",
       " ('PowerGizmo', 29.99, 'GizmoWorks'),\n",
       " ('SingleTouch', 149.99, 'Canon'),\n",
       " ('MultiTouch', 203.99, 'Hitachi')]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql \n",
    "SELECT Pname, Price, Manufacturer\n",
    "FROM Product;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "* The output is *still* a table, and its schema is \n",
    "> Answer(pname, price, manufacturer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "* Of course, we can combine selection and projection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'GizmoWorks'), ('PowerGizmo', 29.99, 'GizmoWorks')]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT Pname, Price, Manufacturer\n",
    "FROM Product\n",
    "WHERE category='Gadgets';"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The output of a query on a table is again a table \n",
    "----------------------------------------------\n",
    "* This is because the query language is *compositional*\n",
    "* The output of a query really is a table!\n",
    "* look at this crazy query, what does it ask for?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>manufacturer</th>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>GizmoWorks</td>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('GizmoWorks', 'PowerGizmo', 29.99)]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT * FROM Product;\n",
    "\n",
    "SELECT\n",
    "    p.manufacturer, p.pname, p.price\n",
    "FROM \n",
    "    (SELECT distinct p0.Manufacturer\n",
    "     FROM Product p0\n",
    "     WHERE p0.price < 20.00) cp, -- this is a nested query!\n",
    "    Product p\n",
    "WHERE \n",
    "    p.manufacturer = cp.manufacturer and p.price > 20.00"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Details on SQL\n",
    "--------------\n",
    "\n",
    "* Some elements are case insensitive (think: program):\n",
    "  * Same: SELECT  Select  select\n",
    "  * Same: Product   product\n",
    "  * Different: ‘Seattle’  ‘seattle’\n",
    "  \n",
    "\n",
    "* Constants (single quotes)\n",
    "  * ‘abc’  - yes\n",
    "  * “abc” - no\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "LIKE\n",
    "====\n",
    "\n",
    "The LIKE operator is to search strings, perhaps with wildcards. Format is:\n",
    "    \n",
    "> SELECT *\n",
    "> FROM Products\n",
    "> WHERE pname like '%gizmo%'\n",
    "\n",
    "* % matches any number of characters\n",
    "* \\_ matches one character\n",
    "* The like operator is case sensitive\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>category</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),\n",
       " ('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks')]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT *  FROM product\n",
    "where pname LIKE '%Gizmo%'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Eliminating Duplicates\n",
    "---------------------\n",
    "* duplicates can sometimes be unwelcome or suprising. \n",
    "  * Recall tables are _multisets_!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>category</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gadgets</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gadgets</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Photography</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Household</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gadgets',), ('Gadgets',), ('Photography',), ('Household',)]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%sql SELECT category from product;"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>category</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gadgets</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Photography</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Household</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gadgets',), ('Photography',), ('Household',)]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql \n",
    "-- easy to remove duplicates, use the distinct keyword\n",
    "SELECT DISTINCT category from product;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Ordering the results\n",
    "---------------------\n",
    "* Sometimes you want the results ordered, let's see some examples!\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>MultiTouch</td>\n",
       "        <td>203.99</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('SingleTouch', 149.99, 'Canon'), ('MultiTouch', 203.99, 'Hitachi')]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "-- sometimes we want to order the results.\n",
    "-- order by is ascending by default!\n",
    "SELECT   pname, price, manufacturer\n",
    "FROM     Product\n",
    "WHERE    price > 50\n",
    "ORDER BY  price, pname"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>price</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>149.99</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>29.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>19.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>203.99</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[(149.99, 'Canon'),\n",
       " (29.99, 'GizmoWorks'),\n",
       " (19.99, 'GizmoWorks'),\n",
       " (203.99, 'Hitachi')]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "-- sometimes we want to order the results.\n",
    "-- can order like so, each component individually\n",
    "SELECT   price, manufacturer\n",
    "FROM     Product\n",
    "-- the order is \"dictionary order\" in the clause.\n",
    "ORDER BY   manufacturer ASC, price DESC"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Joins!\n",
    "------\n",
    "Let's illustrate some more complex queries that join two tables together.\n",
    "\n",
    "* Consider a table of companies, stock price, and HQ country.\n",
    "> company(<u>cname</u>, stockprice, country)\n",
    " \n",
    "* we'll then revist products and introduce some consistency requirements "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n",
      "Done.\n",
      "Done.\n",
      "1 rows affected.\n",
      "1 rows affected.\n",
      "1 rows affected.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "drop table if exists product; -- This needs to be dropped if exists, see why further down!\n",
    "drop table if exists company;\n",
    "create table company (\n",
    "    cname varchar primary key, -- company name uniquely identifies the company.\n",
    "    stockprice money, -- stock price is in money \n",
    "    country varchar); -- country is just a string\n",
    "insert into company values ('GizmoWorks', 25.0, 'USA');\n",
    "insert into company values ('Canon', 65.0, 'Japan');\n",
    "insert into company values ('Hitachi', 15.0, 'Japan');"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>cname</th>\n",
       "        <th>stockprice</th>\n",
       "        <th>country</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>GizmoWorks</td>\n",
       "        <td>25</td>\n",
       "        <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Canon</td>\n",
       "        <td>65</td>\n",
       "        <td>Japan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Hitachi</td>\n",
       "        <td>15</td>\n",
       "        <td>Japan</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('GizmoWorks', 25, 'USA'), ('Canon', 65, 'Japan'), ('Hitachi', 15, 'Japan')]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%sql select * from company;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Foreign Key Constraints\n",
    "-----------------------\n",
    "* Suppose that we want to create a products table\n",
    "\n",
    "> Product(pname, price, category, manufacturer)\n",
    "\n",
    "* Something is odd here: We can have manufacturers that sell products but don't occur in our company table!\n",
    "* To protect against, this we introduce _foreign keys_ "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We say the company name in products _refers_ to cname in company. Let's do it in SQL! the key statement below is:\n",
    "\n",
    "> foreign key (manufacturer) references company(cname)\n",
    "\n",
    "  * Note that cname must be a key in company! \n",
    "  * Keys and Foreign keys come up _all_ the time. \n",
    "    * PKs and FKs are common (others, less so)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n",
      "Done.\n",
      "Done.\n",
      "1 rows affected.\n",
      "1 rows affected.\n",
      "1 rows affected.\n",
      "1 rows affected.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql drop table if exists product;\n",
    "pragma foreign_keys = ON; -- WARNING by default off in sqlite\n",
    "create table product(\n",
    "       pname varchar primary key, -- name of the product\n",
    "       price money, -- price of the product\n",
    "       category varchar, -- category\n",
    "       manufacturer varchar, -- manufacturer\n",
    "       foreign key (manufacturer) references company(cname));\n",
    "\n",
    "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n",
    "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n",
    "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n",
    "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Indeed foreign keys are a _constraint_ \n",
    "> What happens if we introduce a company name not in our table?\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "(sqlite3.IntegrityError) UNIQUE constraint failed: product.pname\n",
      "[SQL: insert into product values('MultiTouch', 203.99, 'Household', 'Google');]\n",
      "(Background on this error at: http://sqlalche.me/e/gkpj)\n",
      "Rejected!\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    %sql insert into product values('MultiTouch', 203.99, 'Household', 'Google');\n",
    "except Exception as e:\n",
    "    print(e) \n",
    "    print(\"Rejected!\") "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>category</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>Gadgets</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Photography</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>MultiTouch</td>\n",
       "        <td>203.99</td>\n",
       "        <td>Household</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),\n",
       " ('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks'),\n",
       " ('SingleTouch', 149.99, 'Photography', 'Canon'),\n",
       " ('MultiTouch', 203.99, 'Household', 'Hitachi')]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "-- the update is rejected!\n",
    "select * from product;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Foreign Keys and Delete\n",
    "=============\n",
    "\n",
    "* What happens if we delete a company? Three options:\n",
    "  * Disallow the delete. (default)\n",
    "  * Remove all products (add \"`on delete cascade`\")\n",
    "  * A third variant due to NULL\n",
    "  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**First option (default)- delete is disallowed**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "(sqlite3.IntegrityError) FOREIGN KEY constraint failed\n",
      "[SQL: delete from company where cname = 'Hitachi';]\n",
      "(Background on this error at: http://sqlalche.me/e/gkpj)\n",
      "Disallowed!\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    %sql delete from company where cname = 'Hitachi';\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    print(\"Disallowed!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Second option: remove all products belonging to the company we delete**\n",
    "\n",
    "Try adjusting the foreign key constraint clause when you create the products table as follows:\n",
    "> foreign key (manufacturer) references company(cname) on delete cascade\n",
    "\n",
    "Now, when a company row is deleted, all of the products linked by the foreign key constraint will be deleted as well."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Joins!\n",
    "------\n",
    "> Product (<u>pname</u>,  price, category, manufacturer)<br>\n",
    "> Company (<u>cname</u>, stockPrice, country)\n",
    "\n",
    "The query we want to answer is:\n",
    "\n",
    "> Find all products under $200 manufactured in Japan;\n",
    "> return their names and prices. \n",
    "\n",
    "Notice products don't have a location and manufacturers don't have price. Need info in _each_ of the tables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('SingleTouch', 149.99)]"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT pname, price\n",
    "FROM product, company\n",
    "where manufacturer=cname and country='Japan' and price <= 200;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see how to write this join in a modular way."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>cname</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Canon',), ('Hitachi',)]"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql -- Part 1: Set of Japanese companies.\n",
    "SELECT distinct cname -- do we need distinct?\n",
    "from company where country='Japan';"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>manufacturer</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'GizmoWorks'),\n",
       " ('PowerGizmo', 29.99, 'GizmoWorks'),\n",
       " ('SingleTouch', 149.99, 'Canon')]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql -- Part 2: Cheap Products (under $200)\n",
    "select distinct pname, price, manufacturer\n",
    "from product\n",
    "where price <= 200;"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "        <th>manufacturer</th>\n",
       "        <th>cname</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Gizmo</td>\n",
       "        <td>19.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>PowerGizmo</td>\n",
       "        <td>29.99</td>\n",
       "        <td>GizmoWorks</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Canon</td>\n",
       "        <td>Canon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "        <td>Canon</td>\n",
       "        <td>Hitachi</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Gizmo', 19.99, 'GizmoWorks', 'Canon'),\n",
       " ('Gizmo', 19.99, 'GizmoWorks', 'Hitachi'),\n",
       " ('PowerGizmo', 29.99, 'GizmoWorks', 'Canon'),\n",
       " ('PowerGizmo', 29.99, 'GizmoWorks', 'Hitachi'),\n",
       " ('SingleTouch', 149.99, 'Canon', 'Canon'),\n",
       " ('SingleTouch', 149.99, 'Canon', 'Hitachi')]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql -- combine them with nested SFW queries... this is a cross product?\n",
    "SELECT * \n",
    "FROM \n",
    "  (SELECT DISTINCT pname, price, manufacturer\n",
    "   FROM product\n",
    "   WHERE price <= 200) CheapProducts,\n",
    "  (SELECT DISTINCT cname\n",
    "   FROM company\n",
    "   WHERE country='Japan') JapaneseProducts;"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>pname</th>\n",
       "        <th>price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SingleTouch</td>\n",
       "        <td>149.99</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('SingleTouch', 149.99)]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "-- Combine them as a join!\n",
    "SELECT DISTINCT pname, price\n",
    "FROM \n",
    "  (SELECT DISTINCT pname, price, manufacturer\n",
    "   FROM product\n",
    "   WHERE price <= 200) CheapProducts,\n",
    "  (SELECT distinct cname\n",
    "   FROM company\n",
    "   WHERE country='Japan') JapaneseProducts\n",
    "WHERE cname = manufacturer;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Takeways\n",
    "--------\n",
    "* There are potentially _many logically equivalent ways_ to write a query\n",
    "    * This fact will be used later by the query optimizer and in homework!\n",
    "    * On exams, write the simplest thing (break it down in parts?)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Duplicate answers from join\n",
    "--------------------------\n",
    "\n",
    "Note that we can get duplicate answers from a join..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * sqlite://\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>country</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>USA</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('USA',), ('USA',)]"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql -- duplicate answer\n",
    "SELECT Country\n",
    "FROM Product, Company\n",
    "WHERE  Manufacturer=CName AND Category='Gadgets';"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
